Search results for "Free sequence"
showing 10 items of 11 documents
Textual data compression in computational biology: Algorithmic techniques
2012
In a recent review [R. Giancarlo, D. Scaturro, F. Utro, Textual data compression in computational biology: a synopsis, Bioinformatics 25 (2009) 1575–1586] the first systematic organization and presentation of the impact of textual data compression for the analysis of biological data has been given. Its main focus was on a systematic presentation of the key areas of bioinformatics and computational biology where compression has been used together with a technical presentation of how well-known notions from information theory have been adapted to successfully work on biological data. Rather surprisingly, the use of data compression is pervasive in computational biology. Starting from…
Long read alignment based on maximal exact match seeds
2012
Motivation: The explosive growth of next-generation sequencing datasets poses a challenge to the mapping of reads to reference genomes in terms of alignment quality and execution speed. With the continuing progress of high-throughput sequencing technologies, read length is constantly increasing and many existing aligners are becoming inefficient as generated reads grow larger. Results: We present CUSHAW2, a parallelized, accurate, and memory-efficient long read aligner. Our aligner is based on the seed-and-extend approach and uses maximal exact matches as seeds to find gapped alignments. We have evaluated and compared CUSHAW2 to the three other long read aligners BWA-SW, Bowtie2 an…
2014
The majority of next-generation sequencing short-reads can be properly aligned by leading aligners at high speed. However, the alignment quality can still be further improved, since usually not all reads can be correctly aligned to large genomes, such as the human genome, even for simulated data. Moreover, even slight improvements in this area are important but challenging, and usually require significantly more computational endeavor. In this paper, we present CUSHAW3, an open-source parallelized, sensitive and accurate short-read aligner for both base-space and color-space sequences. In this aligner, we have investigated a hybrid seeding approach to improve alignment quality, which incorp…
Free sequences and the tightness of pseudoradial spaces
2019
Let F(X) be the supremum of cardinalities of free sequences in X. We prove that the radial character of every Lindelöf Hausdorff almost radial space X and the set-tightness of every Lindelöf Hausdorff space are always bounded above by F(X). We then improve a result of Dow, Juhász, Soukup, Szentmiklóssy and Weiss by proving that if X is a Lindelöf Hausdorff space, and $$X_\delta$$ denotes the $$G_\delta$$ topology on X, then $$t(X_\delta) \le 2^{t(X)}$$. Finally, we exploit this to prove that if X is a Lindelöf Hausdorff pseudoradial space then $$F(X_\delta) \le 2^{F(X)}$$.
An effective extension of the applicability of alignment-free biological sequence comparison algorithms with Hadoop
2016
Alignment-free methods are one of the mainstays of biological sequence comparison, i.e., the assessment of how similar two biological sequences are to each other, a fundamental and routine task in computational biology and bioinformatics. They have gained popularity since, even on standard desktop machines, they are faster than methods based on alignments. However, with the advent of Next-Generation Sequencing Technologies, datasets whose size, i.e., number of sequences and their total length, is a challenge to the execution of alignment-free methods on those standard machines are quite common. Here, we propose the first paradigm for the computation of k-mer-based alignment-free methods for…
Alignment-Free Sequence Comparison over Hadoop for Computational Biology
2015
Sequence comparison, i.e., the assessment of how similar two biological sequences are to each other, is a fundamental and routine task in Computational Biology and Bioinformatics. Classically, alignment methods are the de facto standard for such an assessment. In fact, considerable research effort toward the development of efficient algorithms, on both classic and parallel architectures, has been carried out in the past 50 years. Due to the growing amount of sequence data being produced, a new class of methods has emerged: alignment-free methods. Research in this area has become very intense in the past few years, stimulated by the advent of Next Generation Sequencing technologies, since those…
A note on discrete sets
2009
We give several partial positive answers to a question of Juhász and Szentmiklóssy regarding the minimum number of discrete sets required to cover a compact space. We study the relationship between the size of discrete sets, free sequences and their closures with the cardinality of a Hausdorff space, improving known results in the literature.
Upper bounds for the tightness of the $$G_\delta$$-topology
2021
We prove that if X is a regular space with no uncountable free sequences, then the tightness of its $$G_\delta$$ topology is at most the continuum and if X is, in addition, assumed to be Lindelöf then its $$G_\delta$$ topology contains no free sequences of length larger than the continuum. We also show that, surprisingly, the higher cardinal generalization of our theorem does not hold, by constructing a regular space with no free sequences of length larger than $$\omega_1$$, but whose $$G_\delta$$ topology can have arbitrarily large tightness.
On the cardinality of almost discretely Lindelöf spaces
2016
A space is said to be almost discretely Lindelöf if every discrete subset can be covered by a Lindelöf subspace. Juhász et al. (Weakly linearly Lindelöf monotonically normal spaces are Lindelöf, preprint, arXiv:1610.04506) asked whether every almost discretely Lindelöf first-countable Hausdorff space has cardinality at most continuum. We prove that this is the case under $$2^{<\mathfrak{c}}=\mathfrak{c}$$ (which is a consequence of Martin’s Axiom, for example) and for Urysohn spaces in ZFC, thus improving a result by Juhász et al. (First-countable and almost discretely Lindelöf $$T_3$$ spaces have cardinality at most continuum, preprint, arXiv:1612.06651). We conclude with a few rel…
A short proof of a theorem of Juhász
2011
We give a simple proof of the increasing strengthening of Arhangelʼskii's Theorem. Our proof naturally leads to a refinement of this result of Juhász.